Part-of-speech tagging and chunk parsing of spoken Dutch using support vector machines

Author

  • Luite Stegeman
Abstract

This paper describes the design and evaluation of a part-of-speech tagger and a chunk parser for spoken Dutch, both based on support vector machines. The data in the Corpus Gesproken Nederlands (Spoken Dutch Corpus) is split into smaller subproblems to obtain reasonable training and tagging speed with various kernel types. The tagger combines good accuracy with reasonable tagging speed. The chunk parser also shows good accuracy but suffers from low speed.

Similar resources

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a form of syntactic parsing of natural language that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline. Unfortunately, in pipel...

A Fast Boosting-based Learner for Feature-Rich Tagging and Chunking

Combination of features contributes to a significant improvement in accuracy on tasks such as part-of-speech (POS) tagging and text chunking, compared with using atomic features. However, selecting combination of features on learning with large-scale and feature-rich training data requires long training time. We propose a fast boosting-based algorithm for learning rules represented by combinati...

Improved Arabic Base Phrase Chunking with a new enriched POS tag set

Base Phrase Chunking (BPC), or shallow syntactic parsing, is proving to be a task of interest to many natural language processing applications. In this paper, a BPC system is introduced that improves over state-of-the-art performance in BPC using a new part-of-speech (POS) tag set. The new POS tag set, ERTS, reflects some of the morphological features specific to Modern Standard Arabic. ERTS expl...

Target Word Detection and Semantic Role Chunking using Support Vector Machines

In this paper, the automatic labeling of semantic roles in a sentence is considered as a chunking task. We define a semantic chunk as the sequence of words that fills a semantic role defined in a semantic frame. It is straightforward to convert chunking into a tagging task using one of several IOB representations. Using this representation each word is tagged with I, which means that the word i...

A Memory-Based Shallow Parser for Spoken Dutch

We describe the development of a Dutch memory-based shallow parser. The availability of large treebanks for Dutch, such as the one provided by the Spoken Dutch Corpus, allows memory-based learners to be trained on examples of shallow parsing taken from the treebank, and act as a shallow parser after training. An overview is given of a modular memory-based learning approach to shallow parsing, c...

Publication date: 2006